Abstract

The construction of the information capacity for the vector position parameter
in the Minkowskian space-time is presented. This lays the statistical foundations
of the kinematical term of the Lagrangian of the physical action for many field
theory models, derived by the extremal physical information method of Frieden
and Soffer.
1. Introduction

The Fisher information ($I_F$) is a second-degree covariant tensor, which is one of the contrast functions defined on the statistical space $\mathcal{S}$ [?]. It is the local version of the Kullback-Leibler entropy for two distributions infinitesimally displaced from one another. This, in turn, is the model selection tool used in the standard maximum likelihood (ML) method and the basic notion in the definition of the information channel capacity [?].
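The statement that the Fisher information is the local version of the Kullback-Leibler entropy can be illustrated numerically. The following is only a minimal sketch, assuming a one-dimensional Gaussian with known width; the family, the grid and the names `log_gauss`, `s`, `theta` and `eps` are assumptions of this example and are not taken from the paper.

```python
import numpy as np

def log_gauss(y, theta, s):
    """Log-density of a Gaussian with mean theta and standard deviation s."""
    return -0.5 * ((y - theta) / s) ** 2 - np.log(s * np.sqrt(2.0 * np.pi))

s, theta, eps = 2.0, 0.5, 1e-2           # illustrative values (assumptions)
y = np.linspace(theta - 15 * s, theta + 15 * s, 200_001)
dy = y[1] - y[0]

p = np.exp(log_gauss(y, theta, s))
# Kullback-Leibler entropy between p(y|theta) and p(y|theta + eps), by a Riemann sum
kl = np.sum(p * (log_gauss(y, theta, s) - log_gauss(y, theta + eps, s))) * dy

fisher = 1.0 / s ** 2                     # Fisher information of the mean of a Gaussian
quadratic = 0.5 * fisher * eps ** 2       # second-order (local) term of the expansion

print(kl, quadratic)
```

For this particular family the two printed numbers should agree to high accuracy, because the Kullback-Leibler entropy of two equal-width Gaussians is exactly quadratic in the displacement; for other families the agreement is expected to hold only to leading order in `eps`.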

The method of nonparametric estimation that enables the statistical selection of the equations of motion (or generating equations) of various field theory or statistical physics models is called the extremal physical information (EPI) method. The central quantity of EPI analysis is the information channel capacity $I$, which is the trace of the expectation value of the observed Fisher information matrix. Fundamentally, it enters into the EPI formalism as the second-order coefficient in the Taylor expansion of the log-likelihood function [?]. Originally, EPI was proposed by Frieden and Soffer [?]. They used two Fisherian information quantities: the intrinsic information $J$ of the source phenomenon and the information channel capacity $I$, which connects the phenomenon and the observer. Both $J$ and $I$, together with their densities, are used in the construction of two information principles, the structural and the variational one. $J$ and $I$ are the counterparts^1 of the Boltzmann and Shannon entropies, respectively; however, they are physically different from them [?]. Finally, although in the Frieden and Soffer approach the structural information principle is postulated on the expected level, it is then reformulated to the observed one, which together with the variational principle gives two coupled differential equations.

In [?], two information principles were also postulated, but with a different interpretation of the structural information, which is denoted by $Q$. In [?], the derivation, first of the observed and then of the expected structural information principle, from basic principles was given. It was based on the analyticity of the logarithm of the likelihood function, which allows for its Taylor expansion in the neighborhood of the true value of the vector parameter $\Theta$, and on the metricity of the statistical space $\mathcal{S}$ of the system. The analytical structure of the structural information principle and the geometrical meaning of the variational information principle, which leads to the Euler-Lagrange equations, are discussed in [1, ?]. Both information principles make the EPI method a fundamental tool of physical model selection, which is a kind of nonparametric estimation whose output, obtained by solving the information principles, is the equations of motion or the distribution generating equation. Their use for the derivation of the basic field theory equations of motion is thus fundamental, as they anticipate these equations [1, ?, ?]. The fact that the formalism of the information principles is used for the derivation of the distribution generating equation signifies that^2 the microcanonical description of the thermodynamic properties of a compound system has to meet the analyticity and metricity assumptions as well.

Thus, it is obvious from the former work that both the form of $I$ and its density [?] play crucial roles in the construction of a particular physical model. Frieden and Soffer, along with Plastino and Plastino, put into practice the solution of the differential information principles for various EPI models. Previously, in [?], the role of $I$ was discussed in many field theory contexts, and in [?] the general view on the construction of $I$ was also presented. The present paper concentrates on the construction of $I$ for field theory models with the estimation performed in the base space with the Minkowski metric. An important model parameter in the EPI analysis is the dimension $N$ of the sample, which enters into the channel information capacity $I$ via the likelihood function of the system. The physical models form two separate categories with respect to $N$. In the first one, to which wave mechanics and field theories belong, both $N$ and $I$ are finite [8]. Classical mechanics, on the continuum base space $\mathcal{Y}$, forms the second class^3, having infinite $N$ and thus infinite $I$. This fact was applied in [7] to prove the impossibility of deriving wave and field theory models from classical mechanics. In the case of the first category, the sample size $N$ is the rank of the field of the system [?]. For example, with $N = 8$, the EPI method can result in the Dirac wave equation, whereas with $N = 32$ it can result in the Rarita-Schwinger one [?]. In the realm of statistical physics, e.g. for $N = 1$, the equilibrium Maxwell-Boltzmann velocity law is obtained, while for $N > 1$, the non-equilibrium, although still stationary, solutions (that otherwise follow from the Boltzmann transport equation) were discovered [2]. Since the observed structural information was obtained and the new interpretation of the structural information $Q$ established [5, 9, 8], some of these models have been recalculated [?, 10, 11]. It appears that for every field, $N$ can be related to the dimension of the representation of the group of symmetry transformations of the field in question [2].

^1 In the sense that they are similar in relating them [?].
^2 In agreement with Jaynes' principle [?].
^3 In contrast to e.g. wave mechanics, in classical mechanics the solution of the equation of motion does not determine (from this equation) the structure of a particle, which has to be determined independently at every point of the particle trajectory by the definition of its point structure, e.g. by means of the Dirac $\delta$-distribution.

The paper will also deal with the kinematic form of the channel information capacity $I$ expressed in terms of the point probability distributions of the sample. For $N = 1$, the usefulness of this form is seen in the proof of the I-theorem [?], which is the informational analog of the Boltzmann H-theorem. Then, the Fisher temperature, which is the sensitivity of the Fisher information to a change of the scalar parameter, can be defined. Thus, the I-theorem is in a sense more general than its older thermodynamic predecessor, as it can describe not only the thermodynamic properties of a compound system [13] but also those of an elementary one-particle system.



2. The basics: Rao-Fisher metric and Rao-Cramer theorem 

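Suppose that the original random variable $Y$ takes vector values $\mathbf{y} \in \mathcal{Y}$ and let the vector parameter $\theta$ of the distribution $p(\mathbf{y})$, in which we are interested, be the expected parameter, i.e. the expectation value of $Y$:

$$\theta \equiv E(Y) = \int_{\mathcal{Y}} d\mathbf{y}\, p(\mathbf{y})\, \mathbf{y} . \tag{1}$$

Let us now consider the $N$-dimensional sample $Y = (Y_1, Y_2, \ldots, Y_N) \equiv (Y_n)_{n=1}^{N}$, where every $Y_n$ is the variable $Y$ in the $n$-th population, $n = 1, 2, \ldots, N$, which is characterized by the value of the vector parameter $\theta_n$. The specific realization of the sample takes the form $y = (\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_N) \equiv (\mathbf{y}_n)_{n=1}^{N}$, where every datum $\mathbf{y}_n$ is generated from the distribution $p_n(\mathbf{y}_n|\Theta)$ of the random variable $Y_n$, where the vector parameter $\Theta$ is given by:

$$\Theta = (\theta_1, \theta_2, \ldots, \theta_N)^T \equiv (\theta_n)_{n=1}^{N} , \qquad \theta_n = (\vartheta_{1n}, \vartheta_{2n}, \ldots, \vartheta_{kn})^T \equiv ((\vartheta_s)_{s=1}^{k})_n . \tag{2}$$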

The set of all possible realizations $y$ of the sample $Y$ forms the sample space $\mathcal{B}$ of the system. In this paper, we assume that the variables $Y_n$ of the sample $Y$ are independent. Hence, the expected parameter $\theta_{n'} = \int_{\mathcal{B}} dy\, P(y|\Theta)\, \mathbf{y}_{n'}$ does not influence the point probability distribution $p_n(\mathbf{y}_n|\theta_n)$ for the sample index $n' \neq n$. The data are generated in agreement with the point probability distributions, which fulfill the condition:

$$p_n(\mathbf{y}_n|\Theta) = p_n(\mathbf{y}_n|\theta_n) , \quad \text{where } n = 1, \ldots, N , \tag{3}$$

and the likelihood function $P(y|\Theta)$ of the sample $y = (\mathbf{y}_n)_{n=1}^{N}$ is the product:

$$P(\Theta) \equiv P(y|\Theta) = \prod_{n=1}^{N} p_n(\mathbf{y}_n|\theta_n) . \tag{4}$$

The set of values of $\Theta = (\theta_n)_{n=1}^{N}$ forms the coordinates of $P(y|\Theta)$, which is a point in the $d = k \times N$-dimensional statistical (sub)space $\mathcal{S}$. The number of all parameters is equal to $d = k \times N$, but for simplicity, we will use the notation $\Theta = (\theta_1, \theta_2, \ldots, \theta_d)^T \equiv (\theta_i)_{i=1}^{d}$, where the index $i = 1, 2, \ldots, d$ replaces the pair of indexes "$sn$". The likelihood function is formally the joint probability distribution of the realization $y = (\mathbf{y}_n)_{n=1}^{N}$ of the sample $Y = (Y_n)_{n=1}^{N}$; hence, $P$ is a probability measure on $\mathcal{B}$. The set of all measures $\Sigma(\mathcal{B})$ on $\mathcal{B}$ is the state space of the model.

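The Fisher information matrix: Let us now examine a subset $\mathcal{S} \subset \Sigma(\mathcal{B})$ on which the coordinate system $(\theta_i)_{i=1}^{d}$ is given [1], so that the statistical space $\mathcal{S}$ is a $d$-dimensional manifold^4. Assume that on $\mathcal{B}$ the $d$-dimensional statistical model:

$$\mathcal{S} = \{ P_\Theta \equiv P(y|\Theta),\ \Theta \equiv (\theta_i)_{i=1}^{d} \in V_\Theta \subset \mathbb{R}^d \} \tag{5}$$

is given, i.e. the family of probability distributions parameterized by $d$ non-random variables $(\theta_i)_{i=1}^{d}$ which are real-valued and belong to the parametric space $V_\Theta$ of the parameter $\Theta$, i.e. $\Theta \in V_\Theta \subset \mathbb{R}^d$. Thus, the logarithm of the likelihood function $\ln P : V_\Theta \to \mathbb{R}$ is defined on the space $V_\Theta$.

Let $\tilde{\Theta} \equiv (\tilde{\theta}_i)_{i=1}^{d} \in V_\Theta$ be another value of the parameter or a value of the estimator $\hat{\Theta}$ of the parameter $\Theta = (\theta_i)_{i=1}^{d}$. At every point $P_\Theta$, the $d \times d$-dimensional observed Fisher information (FI) matrix can be defined [13, ?]:

$$\mathrm{iF}(\Theta) \equiv -\partial^{i'} \partial^{i} \ln P(\Theta) = \left. \left( -\tilde{\partial}^{i'} \tilde{\partial}^{i} \ln P(\tilde{\Theta}) \right) \right|_{\tilde{\Theta} = \Theta} , \tag{6}$$

where $\tilde{\partial}^{i} \equiv \partial/\partial\tilde{\theta}_i$, $\partial^{i} \equiv \partial/\partial\theta_i$, $i, i' = 1, 2, \ldots, d$. It characterizes the local properties of $P(y|\Theta)$. It is symmetric and, in field theory and statistical physics models with continuous, regular [13] and normalized distributions, it is positive definite. We restrict the considerations to this case only. The expected $d \times d$-dimensional FI matrix on $\mathcal{S}$ at the point $P_\Theta$ [1] is defined as follows:

$$I_F(\Theta) \equiv E_\Theta\left( \mathrm{iF}(\Theta) \right) = \int_{\mathcal{B}} dy\, P(y|\Theta)\, \mathrm{iF}(\Theta) , \tag{7}$$

where the differential element $dy \equiv d^N y = d\mathbf{y}_1 d\mathbf{y}_2 \cdots d\mathbf{y}_N$. The subscript $\Theta$ in the expected value signifies the true value of the parameter under which the data $y$ are generated. The FI matrix defines on $\mathcal{S}$ the Riemannian Rao-Fisher metric $g^{ij}$, which in the coordinates $(\theta_i)_{i=1}^{d}$ has the form $(g^{ij}(\Theta)) := I_F(\Theta)$. Under the regularity and normalization conditions $\int_{\mathcal{B}} d^N y\, P(\Theta) = 1$ [13], we obtain $\int_{\mathcal{B}} dy\, P(\Theta)\, \partial^{i} \ln P(\Theta) = 0$, $i = 1, 2, \ldots, d$. Therefore, the elements of $I_F$ can be rewritten as follows:

$$\begin{aligned} g^{ij}(\Theta) &= -E_\Theta\left( \partial^{i} \partial^{j} \ln P(y|\Theta) \right) = -\int_{\mathcal{B}} dy\, P(y|\Theta)\, \partial^{i} \partial^{j} \ln P(y|\Theta) \\ &= \int_{\mathcal{B}} dy\, P(y|\Theta)\, \partial^{i} \ln P(y|\Theta)\, \partial^{j} \ln P(y|\Theta) \\ &= E_\Theta\left( \partial^{i} \ln P(y|\Theta)\, \partial^{j} \ln P(y|\Theta) \right) , \quad i, j = 1, 2, \ldots, d , \quad \forall\, P_\Theta \in \mathcal{S} . \end{aligned} \tag{8}$$

Owing to the last line, the matrix $\mathrm{iF} = (\mathrm{iF}^{i'i})$ is sometimes recorded in the "quadratic" form:

$$\mathrm{iF} = \left( \partial^{i'} \ln P(\Theta)\, \partial^{i} \ln P(\Theta) \right) , \tag{9}$$

as it is useful in the definition of the $\alpha$-connection on the statistical space $\mathcal{S}$ [1].

^4 In this paper, we are interested in global coordinate systems only.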
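The equality of the first and the last line of (8) can be checked numerically for a simple one-parameter model. The sketch below is only an illustration under stated assumptions: it uses the exponential distribution $p(y|\theta) = \theta^{-1} e^{-y/\theta}$, chosen here because its parameter is the expectation value of $Y$; the helper `trapezoid`, the grid and the value of `theta` are hypothetical choices of this example.

```python
import numpy as np

def trapezoid(f, dy):
    """Composite trapezoid rule on a uniform grid."""
    return dy * (np.sum(f) - 0.5 * (f[0] + f[-1]))

theta = 1.5                                     # true expectation value (assumption)
y = np.linspace(0.0, 60.0 * theta, 600_001)
dy = y[1] - y[0]

p = np.exp(-y / theta) / theta                  # p(y|theta), with E(Y) = theta
score = -1.0 / theta + y / theta ** 2           # d ln p / d theta
hessian = 1.0 / theta ** 2 - 2.0 * y / theta ** 3   # d^2 ln p / d theta^2

fisher_hessian = -trapezoid(p * hessian, dy)    # first line of (8)
fisher_quadratic = trapezoid(p * score ** 2, dy)    # last line of (8)
mean_score = trapezoid(p * score, dy)           # vanishes under normalization

print(fisher_hessian, fisher_quadratic, 1.0 / theta ** 2, mean_score)
```

Both estimates should approach the analytic value $1/\theta^2$ of this model, and the mean of the score should be close to zero, which is the regularity condition used in passing from the first to the second line of (8).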

The multiparametric Cramer-Rao theorem: Let $I_F(\Theta)$ be the Fisher information matrix for $\Theta = (\theta_i)_{i=1}^{d}$ and let $\hat{\theta}_i = \hat{\theta}_i(y)$ be an unbiased estimator of the distinguished parameter $\theta_i$:

$$E_\Theta \hat{\theta}_i = \theta_i \in \mathbb{R} , \tag{10}$$

while the values of the remaining parameters may also be unknown, i.e. they are to be estimated from the sample simultaneously with $\theta_i$. Then, the variance of $\hat{\theta}_i$ fulfills the Cramer-Rao (CR) inequality $\sigma^2_\Theta(\hat{\theta}_i) \ge [I_F^{-1}(\Theta)]_{ii} =: I_F^{ii}(\Theta)$, where $I_F^{-1}$ is the inverse of $I_F$ [?]. $I_F^{ii}(\Theta)$ is the Cramer-Rao lower bound (CRLB) of the variance of the estimator $\hat{\theta}_i$ [13].

Let $I_F^{ii} \equiv I_F^{ii}(\Theta)$ and $I_{Fi} \equiv I_{Fii}(\theta_i) \equiv I_{Fii}(\Theta)$ denote the $(i,i)$ elements of the matrices $I_F^{-1}(\Theta)$ and $I_F(\Theta)$, respectively. In the multiparametric case discussed, the following pair of inequalities holds [?]:

$$\sigma^2_\Theta(\hat{\theta}_i) \ge I_F^{ii} \ge \frac{1}{I_{Fi}} , \quad \text{where } 1 \le i \le d , \tag{11}$$

where the first one is the CR inequality. If $\Theta$ is a scalar parameter, or if only the parameter $\theta_i$ is estimated and the others are known, then the second inequality in (11) becomes the equality $I_F^{ii} = 1/I_{Fi}$. In this paper, we maintain the name of the Fisher information of the parameter $\theta_i$ for $I_{Fi}$ regardless of whether the other parameters are simultaneously estimated.
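The pair of inequalities (11) can be made concrete with a small numerical sketch. It assumes, purely for illustration, a two-component Gaussian datum with a known covariance matrix and an unknown mean $(\theta_1, \theta_2)$, for which the Fisher information matrix is the inverse covariance; the matrix `cov` and all variable names are hypothetical.

```python
import numpy as np

# Hypothetical covariance of a two-component Gaussian datum with unknown mean
cov = np.array([[2.0, 0.8],
                [0.8, 1.0]])

fisher = np.linalg.inv(cov)               # I_F for the mean of a Gaussian with known covariance
crlb = np.diag(np.linalg.inv(fisher))     # I_F^{ii} = [I_F^{-1}]_{ii}
single_param = 1.0 / np.diag(fisher)      # 1 / I_{Fi}
var_estimator = np.diag(cov)              # variance of the unbiased estimator theta_hat = y

print(var_estimator, crlb, single_param)
```

In this example the estimator attains the CRLB, so the first inequality in (11) is an equality, while the second one is strict because the off-diagonal element of `cov` is non-zero. Setting that element to zero would turn the second inequality into an equality as well.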
3. The channel information capacity for the position random variable

Let the vector value $\mathbf{y} \in \mathcal{Y}$ of $Y$ be the space position vector. It could be the space-time point $\mathbf{y} \equiv (y^\nu)_{\nu=0}^{3}$ of the four-dimensional Minkowski space $\mathcal{Y} \equiv \mathbb{R}^4$, which occurs in the description of the system, e.g. in wave mechanics, or the space point $\mathbf{y} \equiv (y^\nu)_{\nu=1}^{3} \in \mathcal{Y} \equiv \mathbb{R}^3$ in the three-dimensional Euclidean space^5. Thus, $\mathbf{y} \equiv (y^\nu)$ can possess the vector index $\nu, \mu = (0), 1, 2, 3, \ldots$, where $\nu, \mu = 0, 1, 2, 3, \ldots$ in the Minkowski space and $\nu, \mu = 1, 2, 3, \ldots$ in the Euclidean one. In this analysis, we will use random variables with covariant and contravariant coordinates. The relation between them, both for the values of the random position vector and for the corresponding expectation values, is as follows:

$$y_\nu = \sum_{\mu} \eta_{\nu\mu}\, y^\mu , \qquad \theta_\nu = \sum_{\mu} \eta_{\nu\mu}\, \theta^\mu , \tag{12}$$

where $(\eta_{\nu\mu})$ is the metric tensor of the space $\mathcal{Y}$. In the case of the vectorial Minkowski index, we take the following diagonal form of the metric tensor:

$$(\eta_{\nu\mu}) = \mathrm{diag}(1, -1, -1, -1, \ldots) , \tag{13}$$

whereas for the Euclidean vectorial index, the metric tensor takes the form:

$$(\eta_{\nu\mu}) = \mathrm{diag}(1, 1, 1, \ldots) . \tag{14}$$

The introduction of the metric tensor $\eta$ is important from the measurement point of view. In the measurement of the chosen $\nu$-th coordinate of the position, we are not able to exclude the displacements (fluctuations) of the values of the space-time coordinates which are orthogonal to it. This indicates that the expectation value of the $\nu$-th coordinate of the position is not calculated in (1) from a distribution of the type $p(y^\nu)$, but that it has to be calculated from the joint distribution $p(\mathbf{y})$ for all coordinates $y^\nu$. In addition, the measurement which is independent of the coordinate system is the one of the square length $\mathbf{y} \cdot \mathbf{y}$ and not of the single coordinate $y^\nu$ only, where "$\cdot$" denotes the inner product defined by the metric tensor (13) or (14).
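The role of the metric tensor in (12) and (13), and the remark that the square length rather than a single coordinate is independent of the coordinate system, can be sketched numerically. The code below is an illustration only: the function names `lower`, `inner` and `boost_x`, the sample four-vector and the boost velocity are assumptions of this example, and units with $c = 1$ are assumed.

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])       # Minkowski metric (13)

def lower(v):
    """Lower the index: y_nu = sum_mu eta_{nu mu} y^mu, as in (12)."""
    return eta @ v

def inner(u, v):
    """Inner product defined by the metric tensor."""
    return u @ eta @ v

def boost_x(beta):
    """Lorentz boost along the first spatial axis with velocity beta (c = 1)."""
    gamma = 1.0 / np.sqrt(1.0 - beta ** 2)
    L = np.eye(4)
    L[0, 0] = L[1, 1] = gamma
    L[0, 1] = L[1, 0] = -gamma * beta
    return L

y = np.array([2.0, 1.0, 0.5, -0.3])          # contravariant components y^nu (arbitrary values)
y_boosted = boost_x(0.6) @ y

print(lower(y))
print(inner(y, y), inner(y_boosted, y_boosted))
```

The individual components of `y` change under the boost, whereas the two printed square lengths should coincide up to rounding.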

The statistical dependence of the spatial position variables for different indexes $\nu$ should not be confused with the analytical independence which they possess. This means that the variable $Y$ is the so-called Fisherian variable, for which:

$$\frac{\partial y^\nu}{\partial y^\mu} = \delta^\nu_\mu . \tag{15}$$

Let the data $y \equiv (\mathbf{y}_n)_{n=1}^{N}$ be a realization of the $N$-dimensional sample $Y$ for the positions of the system, where $\mathbf{y}_n \equiv (y_n^\nu)$, $n = 1, 2, \ldots, N$, denotes the $n$-th vectorial observation. Now, as the number of the parameters $\theta_n$, where $n = 1, 2, \ldots, N$, agrees with the dimension $N$ of the sample, and as $\nu$ is the vectorial index of the coordinate $y_n^\nu$ (and therefore the vector parameter has the additional vectorial index $\nu$, too), we have:

$$\Theta = (\theta_1, \theta_2, \ldots, \theta_N) , \quad \text{where } \theta_n = (\theta_n^\nu) , \quad \nu = (0), 1, 2, 3, \ldots , \tag{16}$$

where the expected parameter is the expectation value of the position of the system at the $n$-th measurement point of the sample:

$$\theta_n \equiv E(Y_n) = (\theta_n^\nu) , \quad \text{where } \theta_n^\nu = \int_{\mathcal{B}} dy\, P(y|\Theta)\, y_n^\nu . \tag{17}$$

Here, for $\mathbf{y}_n \equiv (y_n^\nu)_{\nu=0}^{3} \in \mathcal{Y} \equiv \mathbb{R}^4$, $dy := d^4\mathbf{y}_1 \cdots d^4\mathbf{y}_N$ and $d^4\mathbf{y}_n = dy_n^0\, dy_n^1\, dy_n^2\, dy_n^3$.

^5 In the derivation of the equations generating the distribution in statistical physics, $\mathbf{y}$ can be the value of the energy $\epsilon$ of the system [?] and then $\mathbf{y} \equiv \epsilon \in \mathcal{Y} \equiv \mathbb{R}$.

The statistical spaces $\mathcal{S}$ and $\mathcal{S}_{N \times 4}$. Let the discussed space be the Minkowski space-time $\mathcal{Y} \equiv \mathbb{R}^4$. Then, every one of the distributions $p_n(\mathbf{y}_n|\theta_n)$ is a point of the statistical model $\mathcal{S} = \{p_n(\mathbf{y}_n|\theta_n)\}$, which is parameterized by the natural parameter, i.e. by the expectation value $\theta_n \equiv (\theta_n^\nu)_{\nu=0}^{3} = E(Y_n)$, as in (17). Consequently, the dimension of the sample space $\mathcal{B}$ and the dimension of the parametric space $V_\Theta$ of the vector parameter $\Theta \equiv (\theta_n)_{n=1}^{N}$ are equal to each other^6 and, as the sample $Y$ is an $N \times 4$-dimensional random variable, the set $\mathcal{S}_{N \times 4} \equiv \{P(y|\Theta)\}$ is the statistical space on which the parameters $(\theta_n)_{n=1}^{N}$ form the $N \times 4$-dimensional local coordinate system.

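3.1. The Rao-Cramer inequality for the position random variable

According to (8), the channel information capacity in the single $n$-th (measurement) information channel, i.e. the Fisher information for the parameter $\theta_n$, is equal to:

$$\begin{aligned} I_{Fn} \equiv I_{Fn}(\theta_n) &= \int_{\mathcal{B}} dy\, P(y|\Theta) \left( \nabla_{\theta_n} \ln P(y|\Theta) \cdot \nabla_{\theta_n} \ln P(y|\Theta) \right) \\ &= \int_{\mathcal{B}} dy\, P(y|\Theta) \sum_{\nu, \mu = (0), 1, 2, \ldots} \eta^{\nu\mu}\, \frac{\partial \ln P(y|\Theta)}{\partial \theta_n^\nu}\, \frac{\partial \ln P(y|\Theta)}{\partial \theta_n^\mu} , \end{aligned} \tag{18}$$

where the tensor $(\eta^{\nu\mu})$ is dual to $(\eta_{\nu\mu})$, i.e. $\sum_{\mu = (0), 1, 2, \ldots} \eta_{\nu\mu}\, \eta^{\gamma\mu} = \delta_\nu^\gamma$, where $\delta_\nu^\gamma$ is the Kronecker delta, and $\nabla_{\theta_n}$ denotes the gradient with the components $\partial/\partial\theta_n^\nu$.

Simultaneously, the variance of the estimator $\hat{\theta}_n(y)$ of the parameter $\theta_n$ has the form:

$$\begin{aligned} \sigma^2(\hat{\theta}_n) &= \int_{\mathcal{B}} dy\, P(y|\Theta) \left( \hat{\theta}_n(y) - \theta_n \right) \cdot \left( \hat{\theta}_n(y) - \theta_n \right) \\ &= \int_{\mathcal{B}} dy\, P(y|\Theta) \sum_{\nu, \mu = (0), 1, 2, \ldots} \eta_{\nu\mu} \left( \hat{\theta}_n^\nu(y) - \theta_n^\nu \right) \left( \hat{\theta}_n^\mu(y) - \theta_n^\mu \right) . \end{aligned} \tag{19}$$

Let us now observe that for the variables of the space position, the integrals in (18) and (19) are not to be factorized with respect to the vectorial $\nu$-th coordinate.

Now, as a consequence of [?], for every distinguished parameter $\theta_n$, the variance $\sigma^2(\hat{\theta}_n)$ of its estimator given by (19) is connected with the Fisher information $I_{Fn} = I_{Fn}(\theta_n)$ given by (18) in the single information channel for this parameter by the inequality (11):

$$\frac{1}{\sigma^2(\hat{\theta}_n)} \le \frac{1}{I_F^{n}} \le I_{Fn} , \quad \text{where } n = 1, 2, \ldots, N , \tag{20}$$

where $I_F^{n}$ is the CRLB for the parameter $\theta_n$.

^6 Nevertheless, let us remember that, in general, the dimensions of the vector of parameters $\Theta \equiv (\theta_i)_{i=1}^{d}$ and of the sample vector $y \equiv (\mathbf{y}_n)_{n=1}^{N}$ can be different.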

The Stam's information: The quantity $1/\sigma^2(\hat{\theta}_n)$ refers to the single $n$-th channel. The Stam's information $I_S$ [14, 2] is obtained by summing over its index $n$ and is equal to:

$$I_S \equiv \sum_{n=1}^{N} \frac{1}{\sigma^2(\hat{\theta}_n)} =: \sum_{n=1}^{N} I_{Sn} , \tag{21}$$

where $\sigma^2(\hat{\theta}_n)$ is given in (19). This is the scalar measure of the quality of the simultaneous estimation in all information channels. The Stam's information is by definition always nonnegative. As $\theta_n = (\theta_n^\nu)$ is the vectorial parameter, $I_{Sn} = 1/\sigma^2(\hat{\theta}_n)$ itself is the Stam's information of the (time-)space channels for the $n$-th measurement in the sample^7.

Finally, summing the LHS and RHS of (20) over the index $n$ and taking into account (21), we observe that $I_S$ fulfills the inequality:

$$0 \le I_S = \sum_{n=1}^{N} I_{Sn} \le \sum_{n=1}^{N} I_{Fn} =: I , \tag{22}$$

where $I$, denoted in the statistical literature by $C$, is the channel information capacity [?] of the system^8:

$$I = \sum_{n=1}^{N} I_{Fn} . \tag{24}$$

The inequality (22) is the 'minimal', trace generalization of the "single-channel" CR inequality (11).

From the physical modeling point of view, the channel information capacity $I$ is the most important statistical concept, which lays the foundation for the kinematical terms [2] of various field theory models. According to (21), it appears that both for the Euclidean (14) and the Minkowskian (13) metric we perform the estimation in the case of positive $I_S$ only. Hence, from (22) it follows that $I$ is also non-negative. In (?), Section 5, it will be shown that $I$ is non-negative for field theories with particles which have a non-negative square of their masses [13, ?]. This fact for the Minkowskian space ought to be checked in any particular field theory model, but from the estimation theory it is evident that:

$$\sigma^2(\hat{\theta}_n) \ge 0 , \tag{25}$$

which is always the case for causal processes.

^7 The appearance of the Minkowski metric in the Stam's information $I_{Sn}$ in (21) and (19) justifies the use of the error propagation law (and the calculation of the mean square of the measurable quantity) in the arbitrary Euclidean metric, that is, when in addition to the spatial indexes $x_k$, $k = 1, 2, 3$, the temporal index $t$ occurs with the imaginary unit $i$ [15].
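The inequality (22) between the Stam's information and the channel information capacity can be illustrated with a small Monte Carlo sketch. It is restricted, as an explicit assumption, to the Euclidean metric (14) with three spatial components, to isotropic Gaussian point distributions, and to the simple unbiased estimator $\hat{\theta}_n = \mathbf{y}_n$; the channel widths `s`, the true positions `theta` and the number of draws are hypothetical values chosen for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 3                                    # Euclidean index nu = 1, 2, 3, metric (14)
s = np.array([0.5, 1.0, 2.0])              # hypothetical widths of N = 3 channels
n_channels = len(s)
theta = rng.normal(size=(n_channels, dim)) # hypothetical true positions theta_n

n_draws = 200_000
noise = rng.standard_normal((n_draws, n_channels, dim))
y = theta + s[:, None] * noise             # independent data; estimator theta_hat_n = y_n

# sigma^2(theta_hat_n) as in (19), with the Euclidean inner product
sigma2 = np.sum((y - theta) ** 2, axis=2).mean(axis=0)

stam = 1.0 / sigma2                        # I_Sn
fisher = dim / s ** 2                      # I_Fn: trace of the Fisher matrix of an isotropic Gaussian mean
stam_total, capacity = stam.sum(), fisher.sum()

print(stam_total, capacity)
```

Under these assumptions each $I_{Sn}$ is close to $1/(3 s_n^2)$ while $I_{Fn} = 3/s_n^2$, so the Stam's information stays below the capacity in every channel. The Minkowskian case is not covered by this sketch, since there the sign of (19) has to be checked separately.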

Remark 1. The index of the measurement channel: The sample index $n$ is the smallest index of the information channel in which the measurement is performed. This means that even if the sample index were additionally indexed, e.g. by the space-time index $\nu$, it would be impossible to perform the measurement in one of the subchannels $\nu$ assigned in that manner (without performing it simultaneously in the remaining subchannels which also possess the sample index $n$). The channel which is inseparable from the experimental point of view will be referred to as the measurement channel.

Remark 2. If the analysis is restricted to only a part of the measurement channel, one should ascertain that the value of the Stam's information which is the subject of the analysis is positive. For example, if the temporally indexed part of the space-time measurement channel is neglected, the Stam's inequality obtained for the spatial components has the form:

$$0 \le I_S = \sum_{n=1}^{N} I_{Sn}(\vec{\theta}_n) \le \sum_{n=1}^{N} I_{Fn}(\vec{\theta}_n) = I , \tag{26}$$

where the vector notation means that the analysis, both in the sample space and in the parameter space, has been reduced to the spatial part of the random variables and parameters, respectively.

^8 Thus the channel information capacity has the form:

$$I = \sum_{n=1}^{N} I_{Fn} = \int_{\mathcal{B}} dy\, P(y|\Theta) \sum_{n=1}^{N} \sum_{\nu, \mu = 0}^{3} \eta^{\nu\mu}\, \frac{\partial \ln P(y|\Theta)}{\partial \theta_n^\nu}\, \frac{\partial \ln P(y|\Theta)}{\partial \theta_n^\mu} = \int_{\mathcal{B}} dy\, i , \tag{23}$$

where $i := P(\Theta) \sum_{n=1}^{N} \sum_{\nu, \mu = 0}^{3} \eta^{\nu\mu}\, (\partial \ln P(\Theta)/\partial\theta_n^\nu)\, (\partial \ln P(\Theta)/\partial\theta_n^\mu)$ is the channel information density [?]. The channel information capacity $I_{Fn}$ is invariant under the smooth invertible mapping $Y \to X$, where $X$ is the new variable [?]. It is also invariant under space and time reflections.

Remark 3. The symmetry conformity of $I_S$ and $I$: The error of the estimation of the expected position four-vector in the $n$-th measurement channel, and thus $I_{Sn} = 1/\sigma^2(\hat{\theta}_n)$, must be independent of the coordinate system in the Minkowski (or Euclidean) space. Therefore, $I_{Sn}$ as defined in (21) and (19) is invariant under the Lorentz transformations (i.e. boosts and rotations) for the metric tensor given by (13), or under the Galilean ones for the metric tensor given by (14).

As the measurements in the sample are independent, the channel information capacity $I$ is invariant under the Lorentz transformation in the space with the Minkowskian metric, or under the Galilean transformation in the Euclidean space, only if every $I_{Fn}$ is invariant. The conditions of the invariance of $I_{Sn}$ and $I_{Fn}$ converge if the equalities are attained in the inequalities given by (20). More information on the invariance of the CRLB under displacement, space reflection, rotation and affine transformation, as well as unitary transformation, can be found in [18].

Remark 4. Minimization of $I$ with respect to $N$: Every $n$-th term, $n = 1, 2, \ldots, N$, in the sum (24) brings its analytical contribution as a degree of freedom for $I$. Provided that the added degrees of freedom do not affect the already existing ones, then, because every $I_{Fn}$ is non-negative, the channel information capacity $I$ tends to increase with an increase of $N$. The minimization criterion of $I$ with respect to $N$ was used by Frieden and Soffer as an additional condition in the selection of the equation of motion for the field theory model or the generating equation in the statistical physics realm [?]. This means that values of $N$ above the minimal allowable one lack a physical meaning for the particular application, and leaving them in the theory makes it unnecessarily complex. Yet, sometimes some of the consecutive values of $N$ are also necessary. Some examples were given in the Introduction.

4. The kinematic information in the Frieden-Soffer approach

The basic physical assumption of the Frieden-Soffer approach. Let the data $y \equiv (\mathbf{y}_n)_{n=1}^{N}$ be the realization of the sample for the position of the system, where $\mathbf{y}_n \equiv (y_n^\nu)_{\nu=0}^{3} \in \mathcal{Y} \equiv \mathbb{R}^4$. In accordance with the assumption proposed by Frieden and Soffer [?], their collection is carried out by the system alone in accordance with the probability density distributions $p_n(\mathbf{y}_n|\theta_n)$, $n = 1, \ldots, N$. The content of the above assumption can be expressed as follows: The system samples the space-time which is accessible to it, "collecting the data and performing the statistical analysis", in accordance with some information principles (see [?]).

Note: The estimation is performed by a human and the EPI method should be 
perceived as only a type of statistical analysis. 

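The statistical procedure in which we are interested concerns the inference about $p_n(\mathbf{y}_n|\theta_n)$ on the basis of the data $y$ using the likelihood function $P(y|\Theta)$. Therefore, using (4), we can record the derivatives of $\ln P$ standing in $I_{Fn}$ in (18) as follows:

$$\frac{\partial \ln P(y|\Theta)}{\partial \theta_{n\nu}} = \frac{\partial}{\partial \theta_{n\nu}} \sum_{n'=1}^{N} \ln p_{n'}(\mathbf{y}_{n'}|\theta_{n'}) = \frac{1}{p_n(\mathbf{y}_n|\theta_n)}\, \frac{\partial p_n(\mathbf{y}_n|\theta_n)}{\partial \theta_{n\nu}} , \tag{27}$$

and, applying the normalization $\int_{\mathcal{Y}} d^4\mathbf{y}_n\, p_n(\mathbf{y}_n|\theta_n) = 1$ of each of the marginal distributions, we obtain the following form of the channel information capacity:

$$I = \sum_{n=1}^{N} \int_{\mathcal{Y}} d^4\mathbf{y}_n\, \frac{1}{p_n(\mathbf{y}_n|\theta_n)} \sum_{\nu=0}^{3} \frac{\partial p_n(\mathbf{y}_n|\theta_n)}{\partial \theta_{n\nu}}\, \frac{\partial p_n(\mathbf{y}_n|\theta_n)}{\partial \theta_n^\nu} . \tag{28}$$

Finally, as the amplitudes $q_n(\mathbf{y}_n|\theta_n)$ of the measurement data $\mathbf{y}_n \in \mathcal{Y}$ are determined as in

$$p_n(\mathbf{y}_n|\theta_n) =: q_n^2(\mathbf{y}_n|\theta_n) , \tag{29}$$

simple calculations give:

$$I = 4 \sum_{n=1}^{N} \int_{\mathcal{Y}} d^4\mathbf{y}_n \sum_{\nu=0}^{3} \frac{\partial q_n(\mathbf{y}_n|\theta_n)}{\partial \theta_{n\nu}}\, \frac{\partial q_n(\mathbf{y}_n|\theta_n)}{\partial \theta_n^\nu} , \tag{30}$$

which is almost the key form of the channel information capacity of the Frieden-Soffer EPI method. It becomes obvious that, by construction, the rank of the field of the system, which is the number of the amplitudes $(q_n(\mathbf{y}_n|\theta_n))_{n=1}^{N}$, is equal to the dimension $N$ of the sample.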
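The passage from the probability form (28) to the amplitude form (30) rests on $p_n = q_n^2$, which gives $(\partial p_n)^2 / p_n = 4 (\partial q_n)^2$. A one-dimensional numerical sketch of this step is given below; it assumes a single channel ($N = 1$), one coordinate instead of four, and a Gaussian point distribution, all of which are simplifications introduced here for illustration. The variable names are hypothetical.

```python
import numpy as np

s, theta = 1.3, 0.4                         # illustrative width and expectation value
y = np.linspace(theta - 12 * s, theta + 12 * s, 200_001)
dy = y[1] - y[0]

p = np.exp(-0.5 * ((y - theta) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))
q = np.sqrt(p)                              # real amplitude, p = q^2 as in (29)

dp_dtheta = p * (y - theta) / s ** 2        # analytic derivative of p with respect to theta
dq_dtheta = q * (y - theta) / (2.0 * s ** 2)    # analytic derivative of q with respect to theta

capacity_p = np.sum(dp_dtheta ** 2 / p) * dy    # one-dimensional analog of (28)
capacity_q = 4.0 * np.sum(dq_dtheta ** 2) * dy  # one-dimensional analog of (30)

print(capacity_p, capacity_q, 1.0 / s ** 2)
```

Both sums should reproduce the known Fisher information $1/s^2$ of the Gaussian location parameter. The amplitude form avoids the division by $p$, which is one practical reason it is convenient in numerical work.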

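4.1. The kinematic form of the Fisher information

The estimation of physical equations of motion using the EPI method [?] is often connected with the necessity of rewriting $I$, originally defined on the statistical space $\mathcal{S}$, in a form which uses the displacements defined on the base space $\mathcal{B}$ of the sample. The task is performed as follows: Let $\mathbf{x}_n \equiv (x_n^\nu)$ be the displacements (e.g. the additive fluctuations) of the data $\mathbf{y}_n \equiv (y_n^\nu)$ from their expectation values $\theta_n^\nu$, i.e.:

$$y_n^\nu = \theta_n^\nu + x_n^\nu . \tag{31}$$

In accordance with (15), the displacements $x_n^\nu$ are the Fisherian variables. Appealing to the "chain rule" for the derivative [?]:

$$\frac{\partial}{\partial \theta_n^\nu} = \frac{\partial (y_n^\nu - \theta_n^\nu)}{\partial \theta_n^\nu}\, \frac{\partial}{\partial (y_n^\nu - \theta_n^\nu)} = -\frac{\partial}{\partial (y_n^\nu - \theta_n^\nu)} = -\frac{\partial}{\partial x_n^\nu} , \tag{32}$$

and taking into account $d^4\mathbf{x}_n = d^4\mathbf{y}_n$, we can switch from the statistical form (30) to the kinematic form of the channel information capacity:

$$I = 4 \sum_{n=1}^{N} \int_{\mathcal{X}_n} d^4\mathbf{x}_n \sum_{\nu=0}^{3} \frac{\partial q_n(\mathbf{x}_n + \theta_n|\theta_n)}{\partial x_{n\nu}}\, \frac{\partial q_n(\mathbf{x}_n + \theta_n|\theta_n)}{\partial x_n^\nu} , \tag{33}$$

where $\mathcal{X}_n$ is the space of the $\mathbf{x}_n$ displacements. In rewriting (30) in the form of (33), the equality:

$$\sum_{\nu=0}^{3} \frac{\partial q_n(\mathbf{y}_n|\theta_n)}{\partial \theta_{n\nu}}\, \frac{\partial q_n(\mathbf{y}_n|\theta_n)}{\partial \theta_n^\nu} = \sum_{\nu=0}^{3} \frac{\partial q_n(\mathbf{x}_n + \theta_n|\theta_n)}{\partial x_{n\nu}}\, \frac{\partial q_n(\mathbf{x}_n + \theta_n|\theta_n)}{\partial x_n^\nu} \tag{34}$$

has been used.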

Assuming that the range of the variability of all $x_n^\nu$ is the same for every $n$, we can disregard the index $n$ for these variables (but not for the amplitudes $q_n$) and thus obtain the formula:

$$I = 4 \sum_{n=1}^{N} \int_{\mathcal{X}} d^4\mathbf{x} \sum_{\nu=0}^{3} \frac{\partial q_n(\mathbf{x} + \theta_n|\theta_n)}{\partial x_\nu}\, \frac{\partial q_n(\mathbf{x} + \theta_n|\theta_n)}{\partial x^\nu} , \tag{35}$$

where $\mathcal{X}$ is the space of displacements $\mathbf{x}$. The resulting formula (35) indicates that the Fisher-Rao metric (8) on the statistical space $\mathcal{S}$ generates the kinetic energy metric. Let us now note that the kinetic term for every particular amplitude $q_n$ enters, in principle, with a different $\theta_n$. Hence, in fact $N$ kinetic terms have been obtained, one for every amplitude $q_n(\mathbf{x} + \theta_n|\theta_n)$.

The EPI model from which, e.g., a particular field theory model appears is usually^9 built over the displacements space $\mathcal{X}$, which in our case is the Minkowski space-time $\mathbb{R}^4$. Although the statistical model $\mathcal{S}$ is transformed from being defined on the parameter space $V_\Theta \equiv \mathbb{R}^4$ to the space of displacements $\mathcal{X} \equiv \mathbb{R}^4$, the original sample space $\mathcal{B}$ remains its base space both before and after this redefinition.

In this way, the basic tool for the EPI model estimation connected with the derivation of both the generating equations of statistical physics [?] and the equations of motion for field theory models [2, ?, ?], e.g. the Maxwell electrodynamics [?], is obtained.

A simplifying notation will now be introduced:

$$q_n(\mathbf{x}) \equiv q_{\theta_n}(\mathbf{x}) \equiv q_n(\mathbf{x} + \theta_n|\theta_n) , \tag{36}$$

which leaves the whole information on $\theta_n$ that characterizes $q_n(\mathbf{x} + \theta_n|\theta_n)$ in the index $n$ of the amplitude $q_n(\mathbf{x})$ (and similarly for the original distribution $p_n(\mathbf{x})$). With this simplifying notation, the formula (33) can be rewritten as follows:

$$I = 4 \sum_{n=1}^{N} \int_{\mathcal{X}} d^4\mathbf{x} \sum_{\nu=0}^{3} \frac{\partial q_n(\mathbf{x})}{\partial x_\nu}\, \frac{\partial q_n(\mathbf{x})}{\partial x^\nu} , \tag{37}$$

which is understood in accordance with (35).

Note on the shift invariance: In the above derivation, the assumption used in [2] on the invariance of the distribution under the shift (displacement): $p_n(\mathbf{x}_n) \equiv p_{x_n}(\mathbf{x}_n|\theta_n) = p_n(\mathbf{x}_n + \theta_n|\theta_n) = p_n(\mathbf{y}_n|\theta_n)$, where $x_n^\nu \equiv y_n^\nu - \theta_n^\nu$, has not been used.

^9 The analysis of the EPR-Bohm experiment takes place on a parameter space $V_\Theta$ [?].

4.2. The probability form of the kinematic channel information capacity

Here, the starting point is the form of $I$ given by (28). We have to move on to the additive displacements $\mathbf{x}_n \equiv (x_n^\nu)$ of (31) and use the "chain rule" (32). As was mentioned previously, assuming that the range of the variability of all $x_n^\nu$ is the same for every $n$ and disregarding the index $n$ of the variables, but leaving the information about $\theta_n$ in the subscript of the point distributions $p_n$, we obtain the kinematic form of the channel information capacity expressed as a functional of the probabilities:

$$I = \sum_{n=1}^{N} \int_{\mathcal{X}} d^4\mathbf{x}\, \frac{1}{p_n(\mathbf{x})} \sum_{\nu=0}^{3} \frac{\partial p_n(\mathbf{x})}{\partial x_\nu}\, \frac{\partial p_n(\mathbf{x})}{\partial x^\nu} , \tag{38}$$

where the simplifying notation $p_n(\mathbf{x}) \equiv p_{\theta_n}(\mathbf{x}) \equiv p_n(\mathbf{x} + \theta_n|\theta_n)$, with the same meaning as (36) in (37), has been used. The above form is used as the primary one in the construction of $I$ for Maxwell electrodynamics (see below) and in the weak field limit of the gravitation theory [8].

4.3. The channel information capacity for Maxwell electrodynamics

The Frieden-Soffer EPI method of the derivation of the Maxwell equations of electrodynamics was presented in [?]. However, in [?] the form of $I$ is the Euclidean one. What was lacking for the construction of the Minkowski form of $I$ is the notion of the measurement channel presented in the previous sections, from which the fully relativistic covariant form of the channel information capacity required for the Maxwell model follows [?]. Below, we present its construction, from which the EPI method for the Maxwell equations gains its full Fisherian statistical validity.

We begin with the channel information capacity written in the basic form given by (38). The proof that, in order to obtain the Maxwell equations of motion, the field of rank $N = 4$ with real amplitudes $q_n$, $n = 1, 2, 3, 4$, has to be analyzed is given in [?], where one of the assumptions is that the gauge fields are proportional to these amplitudes:

$$q_\nu(\mathbf{x}) = a\, A_\nu(\mathbf{x}) , \quad \text{where } \nu \equiv n - 1 = 0, 1, 2, 3 , \tag{39}$$

where $a$ is a constant and the Minkowski sample index $\nu$ is introduced.

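Using the Minkowski metric $(\eta^{\nu\mu})$, we now define the amplitudes $q^\nu(\mathbf{x})$ which are dual to $q_\nu(\mathbf{x})$:

$$q^\nu(\mathbf{x}) \equiv \sum_{\mu=0}^{3} \eta^{\nu\mu}\, q_\mu(\mathbf{x}) = a \sum_{\mu=0}^{3} \eta^{\nu\mu}\, A_\mu(\mathbf{x}) = a\, A^\nu(\mathbf{x}) , \quad \text{where } \nu \equiv n - 1 = 0, 1, 2, 3 , \tag{40}$$

where the dual gauge fields $A^\nu(\mathbf{x})$ have been introduced:

$$A^\nu(\mathbf{x}) \equiv \sum_{\mu=0}^{3} \eta^{\nu\mu}\, A_\mu(\mathbf{x}) , \quad \text{where } \nu = 0, 1, 2, 3 . \tag{41}$$

The amplitudes $q_\nu(\mathbf{x})$ are connected with the point probability distributions $p_n(\mathbf{x})$ in the following way:

$$p_n(\mathbf{x}) \equiv p_{q_\nu}(\mathbf{x}) = q_\nu(\mathbf{x})\, q_\nu(\mathbf{x}) = a^2 A_\nu(\mathbf{x})\, A_\nu(\mathbf{x}) , \quad \text{where } \nu \equiv n - 1 = 0, 1, 2, 3 . \tag{42}$$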
Consequently, we observe that in the EPI method applied to Maxwell electrodynamics, the sample index $n$ becomes the space-time one. Thus, the form of the channel information capacity has to take into account the additional estimation in the space-time channels, taking, in accordance with the general prescriptions of Section 3.1, the covariant form:

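$$ I = \sum_{\mu=0}^{3} \int_{\mathcal{X}} d^4x \, \frac{1}{p_{q_\mu}(x)} \sum_{\nu=0}^{3} \frac{\partial p_{q_\mu}(x)}{\partial x_\nu} \frac{\partial p_{q_\mu}(x)}{\partial x^\nu} = 4 \sum_{\mu=0}^{3} \int_{\mathcal{X}} d^4x \sum_{\nu=0}^{3} \frac{\partial q_\mu(x)}{\partial x_\nu} \frac{\partial q^\mu(x)}{\partial x^\nu} . \tag{43} $$

Finally, in accordance with (43) and (40), the channel information capacity $I$ is as follows:

$$ I = 4 a^2 \sum_{\mu=0}^{3} \int_{\mathcal{X}} d^4x \sum_{\nu=0}^{3} \frac{\partial A_\mu(x)}{\partial x_\nu} \frac{\partial A^\mu(x)}{\partial x^\nu} . \tag{44} $$
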
It must be stressed that the proportionality $q_\nu(x) = a\,A_\nu(x)$ and the normalization condition:

$$ \frac{1}{4} \sum_{\nu=0}^{3} \int_{\mathcal{X}} d^4x \, q_\nu^2(x) = \sum_{\nu=0}^{3} \int_{\mathcal{X}} d^4x \, A_\nu^2(x) = 1 , \tag{45} $$

where $a = 2$, raise the question of the meaning of the localization of the photon and the existence of its wave function. These points have recently been discussed in the literature [19]. For example, the discussion in [3] supports the view that places the Maxwell equations on the same footing as the Dirac one, a fact previously noted by Sakurai [15]. It is worth noting that the normalization condition$^{10}$ (45) of the four-potential $A_\nu$ harmonizes the value of the proportionality constant $a = 2$ with the rank $N = 4$ of the light field. This value of $a$ also occurred previously, but as the result of harmonizing the EPI result$^{11}$ with the Maxwell equations.
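As a short consistency check on the value $a = 2$, the step between the two sums in (45) can be written out using only the proportionality (39). Nothing beyond the rank $N = 4$ and the choice of the positive root is assumed in this sketch.

```latex
\frac{1}{4}\sum_{\nu=0}^{3}\int_{\mathcal{X}} d^4x\, q_\nu^2(x)
  = \frac{a^2}{4}\sum_{\nu=0}^{3}\int_{\mathcal{X}} d^4x\, A_\nu^2(x)
  \quad\Longrightarrow\quad
  \frac{a^2}{4} = 1 , \qquad a = 2 .
```

The factor $1/4$ is the $1/N$ of the general normalization (49) taken at $N = 4$, which is why the value of $a$ is tied to the rank of the field.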

Now, both the variational and structural (see Introduction and [2, 8, 10]) information principles have to be self-consistently solved with the Lorentz condition

$$ \sum_{\mu=0}^{3} \partial_\mu A^\mu = 0 \tag{46} $$

additionally imposed. This task, with the proper form of the structural information for the Maxwell equations additionally found, is presented in [2]. In that work, the Maxwell equations were obtained by solving the EPI information principles, and a discussion of the solutions for the gauge fields can also be found there. What had been lacking is the construction of the Minkowski form of the channel information capacity, which was presented above.

5. The Fourier information 

Let us now consider the particle as a system described by the field of rank $N$, which is the set$^{12}$ of the amplitudes $q_n(x)$, $n = 1, 2, \ldots, N$, determined in the position space-time $\mathcal{X}$ of displacements $x \equiv (x^\mu)_{\mu=0}^{3} = (ct, x^1, x^2, x^3)$ as in Section 4.1, that possesses the channel information capacity (37). The complex Fourier transforms $\tilde{q}_n(p)$ of the real functions $q_n(x)$, where $p \equiv (p^\mu)_{\mu=0}^{3} = (E/c, p^1, p^2, p^3)$ is the four-momentum which belongs to the energy-momentum space $\mathcal{P}$ conjugated to the position space-time $\mathcal{X}$, have the form:

$$ \tilde{q}_n(p) = \frac{1}{(2\pi\hbar)^2} \int_{\mathcal{X}} d^4x \, q_n(x)\, e^{\,i (Et - \vec{x}\cdot\vec{p})/\hbar} , \tag{47} $$

where $\sum_{\mu=0}^{3} x_\mu p^\mu = Et - \vec{x}\cdot\vec{p}$ and $\hbar$ is the Planck constant.

The Fourier transform is the unitary transformation which preserves the measure 



10 The normalization (45) of the four-potential $A_\nu$ to unity might not necessarily occur. The indispensable condition for $q_\nu(x)$ is that the necessary means can be calculated [20].

11 This is the result of solving the structural and variational information principles together with $\partial_\mu A^\mu = 0$, which is the Lorentz condition.

12 For the case of complex wave functions see Q. 
on the $L^2$ space of square-integrable functions, i.e.:

$$ \int_{\mathcal{X}} d^4x \, q_n^*(x)\, q_m(x) = \int_{\mathcal{P}} d^4p \, \tilde{q}_n^*(p)\, \tilde{q}_m(p) . \tag{48} $$

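Hence, applying the condition of the probability normalization$^{13}$:

$$ \frac{1}{N} \sum_{n=1}^{N} \int_{\mathcal{X}} d^4x \, q_n^2(x) = 1 , \tag{49} $$

we obtain (as a consequence of Parseval's theorem):

$$ \frac{1}{N} \sum_{n=1}^{N} \int_{\mathcal{X}} d^4x \, |q_n(x)|^2 = \frac{1}{N} \sum_{n=1}^{N} \int_{\mathcal{P}} d^4p \, |\tilde{q}_n(p)|^2 = 1 , \tag{50} $$

where $|q_n(x)|^2 \equiv q_n^2(x)$ and $|\tilde{q}_n(p)|^2 \equiv \tilde{q}_n^*(p)\,\tilde{q}_n(p)$.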
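The norm preservation expressed by (48) and (50) can be checked on a discrete toy model. The sketch below is an illustration under stated assumptions, not the four-dimensional construction of the text: it uses a one-dimensional grid, NumPy's unitary discrete Fourier transform (`norm="ortho"`) in place of (47), and randomly generated, hypothetical amplitudes.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4                                   # rank of the field (number of amplitudes)
q = rng.normal(size=(N, 256))           # hypothetical real amplitudes q_n on a 1D grid
q *= np.sqrt(N / np.sum(q**2))          # impose (1/N) * sum_n sum_x q_n^2 = 1, cf. (49)

# Unitary discrete Fourier transform of each amplitude, a stand-in for (47).
q_tilde = np.fft.fft(q, axis=1, norm="ortho")

# Discrete counterparts of the two sides of (48): Gram matrices of inner products.
gram_x = q @ q.T
gram_p = q_tilde.conj() @ q_tilde.T

# Discrete counterparts of the two sides of (50).
norm_x = np.sum(q**2) / N
norm_p = np.sum(np.abs(q_tilde)**2) / N
print(norm_x, norm_p)
```

Both normalizations agree to rounding error, and so do all the pairwise inner products, which is the discrete statement that the transform is unitary.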
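Using (47), we can now record $I$ given by (37) as follows:

$$ I[q(x)] = I[\tilde{q}(p)] = \frac{4}{\hbar^2} \int_{\mathcal{P}} d^4p \sum_{n=1}^{N} |\tilde{q}_n(p)|^2 \left( \frac{E^2}{c^2} - \vec{p}^{\,2} \right) , \tag{51} $$

where $\vec{p}^{\,2} = \sum_{k=1}^{3} p^k p^k$.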
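The passage from the position form of the capacity to its momentum form in (51) can be illustrated with a deliberately simplified analogue. The sketch below assumes one spatial dimension, a Euclidean metric, a single real Gaussian amplitude and $\hbar = 1$; none of these choices comes from the text, and the grid parameters are hypothetical. The position-side derivative is taken by finite differences, so the two values agree only up to discretization error.

```python
import numpy as np

# 1D Euclidean toy analogue (an assumption made for illustration only):
# one real amplitude q(x) on a periodic grid, with hbar = 1.
n, length = 1024, 40.0
dx = length / n
x = -length / 2 + dx * np.arange(n)

q = np.exp(-x**2 / 2)
q /= np.sqrt(np.sum(q**2) * dx)            # normalise: sum q^2 dx = 1

# Position representation: I_x = 4 * integral of (dq/dx)^2.
dq = np.gradient(q, dx)
I_x = 4.0 * np.sum(dq**2) * dx

# Momentum representation: I_k = 4 * sum of k^2 |q~(k)|^2.
k = 2.0 * np.pi * np.fft.fftfreq(n, d=dx)
q_tilde = np.fft.fft(q, norm="ortho")      # unitary discrete transform
I_k = 4.0 * np.sum(k**2 * np.abs(q_tilde)**2) * dx

print(I_x, I_k)  # both close to 2 for this Gaussian
```

For this Gaussian the exact value is $4 \int (q')^2 dx = 2$, so the example also shows that the momentum-side sum is simply a weighted mean of $k^2$, the one-dimensional counterpart of the weight $E^2/c^2 - \vec{p}^{\,2}$ in (51).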

The determination of the square of the particle mass: As the channel information capacity $I$ is by definition the sum of expected values (see (15) and (24)), the squared mass of a particle, defined as:

$$ m^2 \equiv \frac{1}{N c^2} \int_{\mathcal{P}} d^4p \sum_{n=1}^{N} |\tilde{q}_n(p)|^2 \left( \frac{E^2}{c^2} - \vec{p}^{\,2} \right) , \tag{52} $$

is constant and does not depend on the statistical fluctuations of the energy $E$ and momentum $\vec{p}$, i.e. at least as the mean after the integration is performed. Thus, for a free particle we can record (51) as follows:

$$ I[q(x)] = I[\tilde{q}(p)] = 4N \left( \frac{mc}{\hbar} \right)^2 = \text{const} . \tag{53} $$
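The step from (51) to (53) is a direct substitution. The lines below spell it out, assuming that the mass definition (52) carries the prefactor $1/(N c^2)$, which is the reading that makes (51) and (53) agree.

```latex
I[\tilde{q}(p)]
  = \frac{4}{\hbar^2}\int_{\mathcal{P}} d^4p \sum_{n=1}^{N} |\tilde{q}_n(p)|^2
    \left(\frac{E^2}{c^2}-\vec{p}^{\,2}\right)
  = \frac{4}{\hbar^2}\, N c^2 m^2
  = 4N\left(\frac{mc}{\hbar}\right)^2 .
```

In this form $m^2$ appears as the mean of $(E^2/c^2 - \vec{p}^{\,2})/c^2$ taken with the weight $|\tilde{q}_n(p)|^2$ normalized as in (50).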



13 Note on the displacement probability distribution in the system: Appealing to the theorem of total probability, the probability density distribution of the displacement (or fluctuation) in the system can be written as follows:

$$ p(x) = \sum_{n=1}^{N} p(x|\theta_n)\, r(\theta_n) = \sum_{n=1}^{N} p_n(x_n|\theta_n)\, r(\theta_n) = \frac{1}{N} \sum_{n=1}^{N} q_n^2(x_n|\theta_n) = \frac{1}{N} \sum_{n=1}^{N} q_n^2(x) . $$

The function $r(\theta_n) = 1/N$ can be referred to as the "ignorance" function, as its form reflects the total lack of knowledge of which of the $N$ possible values of $\theta_n$ appears in a specific $n$-th experiment in an $N$-dimensional sample.
Therefore, we observe that the causality relation (25) entails $I \ge 0$ in (52) and then, in accord with (53), the condition $m^2 \ge 0$, i.e. the lack of tachyons in the theory [17]. It must be pointed out that, according to (52), the zeroing of the particle mass would be impossible for the Euclidean space-time metric.



The Fourier information with $q$: The condition (53) means that:

$$ K_F \equiv I[q(x)] - I[\tilde{q}(p)] = 0 , \tag{54} $$

which, using the constancy of $4N (mc/\hbar)^2$ and (50), can be rewritten as the condition fulfilled by the free field of rank $N$:

$$ K_F = \int_{\mathcal{X}} d^4x \, k_F = 0 . \tag{55} $$

Here, the quantity $K_F$ defines the so-called Fourier information ($F$), where its density $k_F$ is equal to:

$$ k_F = 4 \sum_{n=1}^{N} \left[ \sum_{\nu=0}^{3} \frac{\partial q_n(x)}{\partial x_\nu} \frac{\partial q_n(x)}{\partial x^\nu} - \left( \frac{mc}{\hbar} \right)^2 q_n^2(x) \right] . \tag{56} $$

Remark 5. Ignoring the fact that from the above calculations $m^2$ emerges as the mean (52), the equation (54), and consequently (55), is a reflection of Parseval's theorem and as such it is a tautology. This means that the Fourier transformation reflects only the change of the basis in the statistical space $\mathcal{S}$. Therefore, by itself the condition (53) does not impose any new constraint unless $4N (mc/\hbar)^2$ is additionally determined as the structural information of the system, which in this particular case defines the type of the field, the scalar one.
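The equivalence of (54) and (55) can be made explicit in one line. The sketch below assumes that the density $k_F$ has the form of a kinematical term minus a mass term, $k_F = 4 \sum_n [\sum_\nu (\partial q_n/\partial x_\nu)(\partial q_n/\partial x^\nu) - (mc/\hbar)^2 q_n^2]$, that the kinematical term integrates to $I[q(x)]$ written in the amplitudes, and it uses the normalization (49) together with (53).

```latex
K_F = \int_{\mathcal{X}} d^4x\, k_F
    = 4\sum_{n=1}^{N}\int_{\mathcal{X}} d^4x \sum_{\nu=0}^{3}
        \frac{\partial q_n}{\partial x_\nu}\frac{\partial q_n}{\partial x^\nu}
      - 4\left(\frac{mc}{\hbar}\right)^2 \sum_{n=1}^{N}\int_{\mathcal{X}} d^4x\, q_n^2(x)
    = I[q(x)] - 4N\left(\frac{mc}{\hbar}\right)^2 = 0 .
```

The second term equals $I[\tilde{q}(p)]$ by (53), so the vanishing of $K_F$ restates (54) and adds no information of its own, in line with the tautology noted in Remark 5.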



6. Conclusions 

A system without a structure dissolves itself; hence its equation of motion requires, besides the kinematical term (which is the channel information capacity $I$), a structural one. Intuitively, "during putting the structure upon", the information on the system has to be maximized, and this is performed by the addition of the structural information $Q$, which acts under the condition of zeroing of the observed structural information principle [7], [5], [8], [9]. Both the analysis of the analyticity of the log-likelihood function and the Rao-Fisher metricity of the statistical (sub)space $\mathcal{S}$ [1] allow for the formal construction of this observed differential form of the structural information principle [5], [9], which, when lifted to the expected level, connects the channel information capacity $I$ with the structural information $Q$. However, when this structural differential constraint is established, then from all distributions in $\mathcal{S}$ the one which minimizes the total physical information $K = I + Q$ is chosen. This is achieved via the variational principle $I + Q \to \min$.
The above two information principles, i.e. firstly, the observed and expected structural information principle and secondly, the variational one, form the core of the EPI method of estimation [8] (see Introduction). In order to estimate correctly the distribution of the system, the EPI method uses these principles, estimating the equations of motion for the particular field theory model, or the equation generating the distribution in the particular model of statistical physics. The whole procedure depends greatly on the form of the channel information capacity $I$ used in the particular physical model, which has to take into account the geometrical and physical preconditions, among these the metric of the base space and the (internal) symmetries of the physical system. For example, different representations of the Lorentz transformation, which is the isometry of $I$, lead to its specific decomposition, giving the Klein-Gordon or Dirac equation. Nonetheless, we cannot rule out that the converse also holds. For instance, from the basic property of the non-negativity of $I$, preceded by the causality property (25), which is the statistical requirement of any successful observation, there follows the very important physical property (53) of the non-existence of tachyons in field theory models.

In accord with Remark 5 beneath (56), only when the structural information principle also fulfils the mass relation (53), which is put as the constraint upon the Fourier transformation (47), does the entanglement$^{14}$ among the momentum degrees of freedom in the system appear.

Now, let us discuss a massless particle, e.g. the photon. If, in agreement with Section 3.3, the condition (55)-(56) is adopted for a particle with mass $m = 0$ and the amplitude is interpreted according to (39) and (45) as characterizing the photon, it would then necessitate answers to both the question of the nature of the photon wave function and that of the signature of the space-time metric from experiments. Indeed, in the Minkowski space-time metric (13), according to (52), the only possibility for a particle to be massless is that it fulfills the condition $E^2/c^2 - \vec{p}^{\,2} = 0$ for all its Fourier monochromatic modes, if only the Fourier modes of massless particles are to possess a physical interpretation. However, this condition means that the Fourier modes are not entangled (in contrast to the massive particle case) and, in principle, the possibility of detecting every individual mode should exist. If the individual Fourier mode frequencies of a light pulse were not detected, it would mean that they are not physical objects [19]. This would lead to a serious problem for the quantum interpretation of a photon as the physical realization of a particular Fourier


14 It follows from Section 5 that the Fourier transformation forms a kind of self-entanglement (between two representations of $I$, one of positions and the other of momenta) of the realized values of the variables of the system and leads to (51). Nevertheless, without putting the structural information principle on the system, which is the analyticity requirement of the log-likelihood function, and without the Rao-Fisher metricity of $\mathcal{S}$, the relation (51) remains a tautological sentence only. Indeed, in the case of the relation (51), the Fourier transformation has to relate the Fisher information matrix, which is in the second order term of the Taylor expansion of the log-likelihood function [5], with the remaining parts of this expansion.
mode. This problem recently occurred in the light beam experiments discussed in [19], which suggest that the Fourier-decomposed frequencies of a pulse do not represent actual optical frequencies, possibly implying that a real photon is a "lump of electromagnetic substance" without Fourier decomposition [19], [21].

Finally, in general, the channel information capacity and the structural information form the basic concepts in the analysis of entanglement in the system. Nonetheless, e.g. for analyses of this phenomenon for the fermionic particle, the complex amplitude should sometimes be introduced.